ABSTRACT

A novel methodology for intelligent music production has
been developed using evolutionary computation. Mixes are
generated by exploration of a “mix-space”, which consists
of a series of inter-channel volume ratios, allowing efficient
generation of random mixes. An interactive genetic algorithm
was used, allowing the user to rate mixes and guide the
system towards their ideal mix. Currently, fitness evaluation
is subjective but can be aided by specific domain knowledge
obtained from a large-scale study of real mixes.

1. BACKGROUND 

Intelligent music production (IMP) has been an active
research topic for over a decade. One aim is the development
of systems which perform common tasks: level-balancing,
equalisation, panning, dynamic range compression and
application of artificial reverberation. Many previous IMP
systems were modelled as expert systems, wherein a music
production task is solved by optimisation, and domain
knowledge, obtained by examining industry “best-practice”
methods, is used to determine the optimisation target [1].
Drawbacks to this method include the fallibility of this type
of domain knowledge and the fundamental assumption that
there is a global optimum, i.e. one mix which all users would
agree is best. Subjective evaluation suggested that existing
systems struggled to compete with human-made mixes [2],
perhaps due to a lack of what we would perceive as
creativity. Additionally, it has been suggested that mix
engineers prefer their own mix to those of their peers [2].
Consequently, IMP tools would benefit from increased
interactivity and subjectivity, to determine user-specific
“personal” global optima in the solution space, instead of a
single “universal” global optimum.

2. CONCEPT 

We propose to use interactive evolutionary computation
(IEC) to solve this problem, as it is well suited to aesthetic
design problems which are non-linear and non-deterministic
[3]. The flowchart in Fig. 1 illustrates the method, which
uses an interactive genetic algorithm (IGA). The solution
space we explored is a “mix-space” which theoretically
represents all the mixes that it is possible to create with a
finite set of tools [4]. For level-balancing, the gains g of all
n tracks are selected from a unit hypersphere in
$\mathbb{R}^n$. This hypersurface has $n-1$ dimensions,
representing a series of inter-channel volume ratios, $\phi$
(see Fig. 2). This method has the advantage that all random
mixes generated are unique and have equal loudness (after
normalising the loudness of tracks beforehand). The fitness
function for optimisation is subjective, allowing mixes to be
generated based on any perceptual description, such as
“warmth”, “punch” or “clarity”, or simply “preference”.

Figure 1: Flowchart of intelligent mixer using IGA

Figure 2: k-means with cosine distance metric (spherical
k-means), clustered in gain-space, for a simple 3-track mixing
task. The population size is 1000 (deliberately large, for
visualisation purposes) and the number of clusters is 5.
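
The mapping between gain-space and mix-space described above
can be sketched with standard hyperspherical coordinates. This
is a minimal illustration, assuming non-negative gains and the
usual cosine/sine product convention; the exact parameterisation
used by the system is not stated here, and the function names
are hypothetical.

```python
import math
import random


def angles_to_gains(phi):
    """Map n-1 mix-space angles to n unit-norm gains.

    Assumed convention (standard hyperspherical coordinates):
    g_1 = cos(phi_1), g_2 = sin(phi_1) cos(phi_2), ...,
    g_n = sin(phi_1) ... sin(phi_{n-1}).
    """
    gains = []
    sin_prod = 1.0
    for angle in phi:
        gains.append(sin_prod * math.cos(angle))
        sin_prod *= math.sin(angle)
    gains.append(sin_prod)
    return gains


def gains_to_angles(gains):
    """Inverse map: recover the n-1 angles from n gains.

    Only the ratios between gains matter, so any common
    positive scaling of the gains gives the same angles.
    """
    phi = []
    for k in range(len(gains) - 1):
        tail = math.sqrt(sum(g * g for g in gains[k + 1:]))
        phi.append(math.atan2(tail, gains[k]))
    return phi


def random_mix(n_tracks, rng=random):
    """Draw a random non-negative gain vector of unit norm by
    normalising absolute Gaussian samples (an assumed sampler)."""
    raw = [abs(rng.gauss(0.0, 1.0)) for _ in range(n_tracks)]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw]
```

Every vector returned by `angles_to_gains` has unit Euclidean
norm, and scaling all gains by a common factor leaves the angles
unchanged, which is why the $n-1$ angles capture only the
inter-channel ratios.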

3. METHOD 

EC typically requires a large population of candidate
solutions. To increase the population size beyond that which
a user could realistically evaluate before becoming fatigued,
the fitness of a rated sub-population is extrapolated to nearby
solutions [5]. Figure 2 shows the population clustered into
c clusters. The mixes closest to each cluster centroid are
chosen for audition and user evaluation.
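
The clustering step can be sketched as follows: a plain
spherical k-means (k-means with cosine similarity) over
unit-norm gain vectors, followed by selection of the mix nearest
each centroid. This is an illustrative sketch only; the
initialisation, iteration count and function names are
assumptions rather than details of the system.

```python
import math
import random


def normalise(v):
    """Scale a vector to unit Euclidean norm (left as-is if zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity of two unit-norm vectors (a dot product)."""
    return sum(x * y for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its most similar centroid."""
    return [max(range(len(centroids)), key=lambda c: cosine(p, centroids[c]))
            for p in points]


def spherical_kmeans(points, k, iterations=20, seed=0):
    """Cluster unit-norm points; centroids are re-normalised means."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumed: random initial centroids
    for _ in range(iterations):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = normalise([sum(col) for col in zip(*members)])
    return centroids, assign(points, centroids)


def representatives(points, centroids, labels):
    """For each non-empty cluster, the index of the mix closest to
    its centroid; these are the mixes a user would audition."""
    reps = {}
    for c, centroid in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            reps[c] = max(members, key=lambda i: cosine(points[i], centroid))
    return reps
```

With c clusters, the user rates at most c mixes per generation,
regardless of how large the underlying population is.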

To aid this extrapolation we introduce findings from a
recent large-scale study of music mixes, which revealed
tolerance ranges for low-level audio features [6]. These
ranges can be used to augment the fitness of the population
alongside the subjective ratings given to a subset of the
population, effectively adding a penalty to mixes which are
unlikely to be created by a real engineer, while still giving
the user the authority to override these heuristics.
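
One way such an augmented fitness could be computed is sketched
below. Everything specific here is an assumption made for
illustration: unrated mixes inherit the rating of their cluster
representative scaled by cosine similarity, a Gaussian-shaped
penalty is applied outside a tolerance range, and `feature` is a
hypothetical placeholder for whatever low-level audio feature is
measured. Mixes the user has actually rated keep their rating
unchanged, so the heuristic never overrides the user.

```python
import math


def cosine(a, b):
    """Cosine similarity of two unit-norm gain vectors."""
    return sum(x * y for x, y in zip(a, b))


def tolerance_penalty(value, low, high, softness):
    """Return 1.0 inside [low, high]; decay smoothly outside it.

    The Gaussian fall-off and the `softness` width are assumed,
    not taken from the study of real mixes.
    """
    if low <= value <= high:
        return 1.0
    distance = low - value if value < low else value - high
    return math.exp(-(distance / softness) ** 2)


def augmented_fitness(mixes, labels, reps, ratings,
                      feature, low, high, softness):
    """Extrapolate user ratings to the whole population.

    mixes   : list of unit-norm gain vectors
    labels  : cluster index of each mix
    reps    : cluster index -> index of the auditioned mix
    ratings : cluster index -> user rating of that mix
    feature : hypothetical callable mapping a mix to a scalar
    """
    rated = set(reps.values())
    scores = []
    for i, (mix, lab) in enumerate(zip(mixes, labels)):
        if i in rated:
            # The user's own rating is kept as given.
            scores.append(ratings[lab])
            continue
        similarity = max(0.0, cosine(mix, mixes[reps[lab]]))
        penalty = tolerance_penalty(feature(mix), low, high, softness)
        scores.append(ratings[lab] * similarity * penalty)
    return scores
```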

While clustering is performed in the gain-space, genetic
operations take place in the mix-space. Currently, the system
uses roulette selection and uniform crossover with mutation.
These operations could also be performed in the gain-space
if solved on the sphere.
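
The three operators named above can be sketched on vectors of
mix-space angles as follows. The mutation rate, the Gaussian
mutation step and the clamping of angles to [0, π/2] (which
assumes non-negative gains) are illustrative choices, not
parameters reported for the system.

```python
import math
import random


def roulette_select(population, fitness, rng):
    """Pick one individual with probability proportional to its
    (non-negative) fitness."""
    pick = rng.uniform(0.0, sum(fitness))
    running = 0.0
    for individual, f in zip(population, fitness):
        running += f
        if f > 0 and running >= pick:
            return individual
    return population[-1]  # fallback for rounding or all-zero fitness


def uniform_crossover(parent_a, parent_b, rng):
    """Take each angle from either parent with equal probability."""
    return [a if rng.random() < 0.5 else b
            for a, b in zip(parent_a, parent_b)]


def mutate(phi, rate, sigma, rng):
    """Perturb each angle with probability `rate`, then clamp to
    [0, pi/2] so the corresponding gains stay non-negative."""
    child = []
    for angle in phi:
        if rng.random() < rate:
            angle += rng.gauss(0.0, sigma)
        child.append(min(max(angle, 0.0), math.pi / 2))
    return child


def next_generation(population, fitness, rng, rate=0.1, sigma=0.05):
    """Breed a new population of the same size."""
    children = []
    for _ in population:
        mother = roulette_select(population, fitness, rng)
        father = roulette_select(population, fitness, rng)
        children.append(mutate(uniform_crossover(mother, father, rng),
                               rate, sigma, rng))
    return children
```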

Typically, in EC, the optimal solution is considered to be
the solution with the highest fitness. However, many
problems that can be addressed by IEC are perceptual and as
such do not require exact solutions but rather seek to
identify an area of the solution space in which many fit
solutions exist which are perceptually similar [3]. In a music
mixing problem there is a limit to the precision required
when determining gain values, as small adjustments in the
gain of individual tracks will not be perceived.

Kernel density estimation (KDE) was used to determine the
region of optimal solutions. Figure 3 shows the univariate
KDE result, with the values of $\phi$ having evolved towards
specific modal values. These values are converted back to
gain-space in order to construct the final mix.

Figure 3: Kernel density estimation, showing modes in
mix-space for a 6-track mixing session. The position of each
mode is highlighted along with the density value. These
values of $\phi$ are transformed to $g_1, \dots, g_n$ to
create the final mix. Note that in this example, multiple
optimal mixes are possible, due to the multi-modal nature of
$\phi_4$.
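
A univariate KDE with mode detection, applied to one mix-space
angle at a time, might look like the sketch below. It assumes a
Gaussian kernel with a fixed, hand-chosen bandwidth and a simple
grid search over [0, π/2]; the kernel, bandwidth selection and
function names are assumptions, not details of the system.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels
    centred on each sample."""
    scale = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return scale * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def find_modes(samples, bandwidth, lo=0.0, hi=math.pi / 2, steps=500):
    """Locate local maxima of the KDE on a regular grid.

    Returns (position, density) pairs, strongest mode first.
    """
    density = gaussian_kde(samples, bandwidth)
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    values = [density(x) for x in grid]
    modes = []
    for i in range(1, steps):
        if values[i] > values[i - 1] and values[i] >= values[i + 1]:
            modes.append((grid[i], values[i]))
    return sorted(modes, key=lambda m: -m[1])
```

Calling `find_modes` on the evolved values of each angle in turn
gives one list of candidate modal values per dimension. When an
angle has more than one mode, each choice leads to a different
candidate final mix once the angles are converted back to gains.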


4. CONCLUSIONS 

Early results indicate that the system can produce a variety
of mixes, suited to varying personal taste. As this system
makes minimal assumptions as to what makes a good mix, or
possibly no assumptions, it learns from the expertise of the
user, rather than following the traditional approach, which
assumes the novice user learns from the expert system. We
believe this approach can be used to further expand the study
of IMP, to deliver personalised object-based audio to
consumers and to increase the understanding of how music is
mixed.